Abstract. Astro-WISE is a scientific information system for the data processing of
optical images. In this paper we review the main features of Astro-WISE and describe the
current status of the system.



1. Introduction

Astro-WISE¹ (Valentijn et al. 2007) stands for Astronomical Wide-field Imaging System
for Europe. The system was initially developed to support the data processing of
the Kilo Degree Survey (KiDS) (Verdoes Kleijn et al. 2012) on the VLT (Very Large
Telescope) Survey Telescope (VST²). The VST is a 2.61 m diameter imaging telescope
installed at the Paranal Observatory in Chile. The instrument installed on the telescope
is OmegaCAM, a large format (16k x 16k) CCD camera which will produce up to 15
TB of raw images per year. This amount is multiplied at least by a factor of 3 by the
data processing.

Astro-WISE was planned as a storage and data processing system for KiDS but,
over time, grew into a general astronomical information system. It has since developed
into a much wider data processing information system which can be used in many other
disciplines. The idea behind Astro-WISE is to keep the data model, the data and the data
processing in a single system.

At the same time, such a system should be shared by a number of institutes and
sites, as the scientific community working with the data in the system is widely
distributed and each group comes to the system with its own resources (data storage
and computing facilities). Moreover, each group develops a part of the software
implemented in the system. The users must be able to form communities (groups) to
process the data for the same instrument or project.

2. Basic Principles of the System 

The development of the Astro-WISE information system started from a very practical
challenge: to enable a community of researchers distributed over the world to process the
data of the OmegaCAM 256 Megapixel camera imaging survey. These scientists should
be able to evaluate the quality of the data, apply a number of calibrations, share the data
within the team and employ distributed resources with PB-scale data storage and
Tflops-scale data processing capacity.

1 http://www.astro-wise.org

2 http://www.eso.org/public/teles-instr/surveytelescopes/vst.html

From this challenge the basic requirements on the system are derived:

* Scalability of the system: any part of the system, i.e. data storage, data
processing and metadata management, should scale with the increase of incoming
data and of the number of users involved in the data processing. The system should
also scale with respect to data processing algorithms and pipelines, making it
possible to implement new pipelines and to reprocess the same data with new
algorithms. Data mining should scale as well, i.e., the system should support the
full range of requests, from the retrieval of a single data item by identifier to
a complicated archive study involving multiple complex queries.

* Distributed system: the system allows any activity to be distributed among dif- 
ferent users and different sites where the system is implemented. 

* Traceability: all activity in the system should leave a clear footprint, so that it
is possible to trace the origin of any change in the data and to identify the algorithm,
program and user that created a data item.

* Adaptability: the system allows for a number of different scientific use-cases and
provides resources, pipelines and expertise to perform data processing according
to the user's interests.

Requirements were set on the common data model realized in the system. The 
common data model is the core of Astro-WISE and implements the following features:

1. Inheritance of data objects. Using object oriented programming, all objects
within the system can inherit key properties of the parent object; all these
properties are made persistent (i.e., stored in a database).

2. Full lineage. The linking (associations or references, or joins) between object
instances in the database is maintained completely. Each data item in the system
can be traced back to its origin. The tracing of a data object can be both forward
and backward: for example, it is possible to find which raw frames were used to
determine the magnitudes, shapes and positions of a particular source and,
conversely, which sources were extracted from a particular raw frame.

3. Consistency. At each processing step, all processing parameters and all inputs
used are kept within the system. Astro-WISE can keep the previous
versions of all data items along with all parameters used to produce them and all
dependencies between objects.

4. Embarrassingly parallel and distributed processing. The administration of
asynchronous processing is naturally recorded in the metadata layer.
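
The inheritance and lineage features above can be illustrated with a minimal Python sketch. The class names `DataObject`, `RawFrame`, `ReducedFrame` and `Source`, the helper functions and the file names are hypothetical and chosen only for illustration; they are not the Astro-WISE classes, and plain in-memory objects stand in for database records.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)  # eq=False: objects are compared by identity, like database rows
class DataObject:
    """Hypothetical base class: every object records its inputs and parameters."""
    name: str
    inputs: List["DataObject"] = field(default_factory=list)
    parameters: dict = field(default_factory=dict)


# Subclasses inherit the key properties (name, inputs, parameters) of the parent.
class RawFrame(DataObject):
    pass


class ReducedFrame(DataObject):
    pass


class Source(DataObject):
    pass


def trace_back(obj, cls):
    """Backward lineage: all ancestors of obj that are instances of cls."""
    found = []
    for parent in obj.inputs:
        if isinstance(parent, cls) and parent not in found:
            found.append(parent)
        for ancestor in trace_back(parent, cls):
            if ancestor not in found:
                found.append(ancestor)
    return found


def trace_forward(origin, candidates):
    """Forward lineage: all candidates that were derived from origin."""
    return [c for c in candidates if origin in trace_back(c, type(origin))]


raw1 = RawFrame("raw1.fits")
raw2 = RawFrame("raw2.fits")
reduced = ReducedFrame("red.fits", inputs=[raw1, raw2],
                       parameters={"flatfield": "flat_v2"})
source = Source("src-001", inputs=[reduced])
other = Source("src-002", inputs=[ReducedFrame("red2.fits", inputs=[raw2])])

# Which raw frames contributed to this source?
print([r.name for r in trace_back(source, RawFrame)])
# Which sources were extracted from this raw frame?
print([s.name for s in trace_forward(raw1, [source, other])])
```

In a real deployment the links would be stored as references in the metadata database, so that both directions of the trace become database queries rather than in-memory walks.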

The programming of both Astro-WISE pipelines and programs employs:

1. Component based software engineering (CBSE). This is a modular approach
to software development: each module can be developed independently and wrapped
in the base language of the system (Python) to form a pipeline or workflow.

Figure 1. Astro-WISE architecture. The Groningen node is shown. The node
consists of a metadata database (Oracle RAC 11g), a computing element (DPU
submitting jobs to the HPC in the Donald Smits Center for Information Technology),
data storage (Astro-WISE dataservers) and a number of web services.



2. An object-oriented common data model used throughout the system. This 
means that each module, application and pipeline will deal with the unified data 
model for the whole cycle of the data processing from the raw data to the final 
data product. 

3. Persistence of all the data model objects. Each data product in the data pro- 
cessing chain is described as an object of a certain class and saved in the archive 
of the specific project along with the parameters used for data processing. 
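
A minimal sketch of how these three principles could fit together is given below: independently developed components are wrapped as Python callables, chained into a pipeline, and every intermediate product is saved together with the parameters that produced it. All names (`Product`, `Archive`, `run_pipeline`, `subtract_bias`, `apply_flat`) are hypothetical, a dictionary stands in for the project archive, and the arithmetic is a toy replacement for real image processing.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Product:
    """Hypothetical persistent data product shared by all components."""
    name: str
    pixels: Tuple[float, ...]
    step: str          # component that created the product
    parameters: tuple  # sorted (key, value) pairs used by that component
    parent: str        # name of the input product ("" for raw data)


class Archive:
    """In-memory stand-in for the archive of a project."""

    def __init__(self):
        self._store: Dict[str, Product] = {}

    def save(self, product: Product) -> None:
        self._store[product.name] = product

    def get(self, name: str) -> Product:
        return self._store[name]


# Each component is an independent function: (pixels, **params) -> pixels.
def subtract_bias(pixels, level=0.0):
    return tuple(p - level for p in pixels)


def apply_flat(pixels, gain=1.0):
    return tuple(p / gain for p in pixels)


def run_pipeline(raw: Product, steps, archive: Archive) -> Product:
    """Run the components in order and persist every intermediate product."""
    archive.save(raw)
    current = raw
    for func, params in steps:
        current = Product(
            name=f"{current.name}.{func.__name__}",
            pixels=func(current.pixels, **params),
            step=func.__name__,
            parameters=tuple(sorted(params.items())),
            parent=current.name,
        )
        archive.save(current)
    return current


archive = Archive()
raw = Product("raw001", (110.0, 120.0), step="ingest", parameters=(), parent="")
final = run_pipeline(
    raw,
    [(subtract_bias, {"level": 100.0}), (apply_flat, {"gain": 2.0})],
    archive,
)
print(final.name, final.pixels)
```

Because every step writes a `Product` of the same class, a new component can be added to the list of steps without changing the others, which is the point of the component based approach.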



3. Realization and Current Status 

Astro-WISE is a federated system distributed over Europe. Currently, nodes of
Astro-WISE are installed at the University of Groningen, Radboud University Nijmegen,
Leiden University (The Netherlands), the Argelander-Institut für Astronomie in Bonn,
the Universitäts-Sternwarte München (Germany) and the Osservatorio Astronomico di
Capodimonte (Italy). Each data item in Astro-WISE is stored in a file with a unique filename
registered in the metadata database. Astro-WISE allows users to share data in projects,
forming groups responsible for the processing of data for a particular instrument or
survey. For example, Astro-WISE is used for KiDS, the Coma Legacy Survey³ and
various WFI surveys.
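
The registration of unique filenames in a metadata database can be sketched as follows. This is only an illustration under stated assumptions: SQLite replaces the production database, and the table layout, the functions `register` and `locate`, the file name and the node names are hypothetical.

```python
import sqlite3

# SQLite stands in for the metadata database in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data_items ("
    " filename TEXT PRIMARY KEY,"  # uniqueness is enforced by the database
    " project  TEXT NOT NULL,"
    " node     TEXT NOT NULL)"     # which dataserver holds the file
)


def register(filename, project, node):
    """Register a file; return False if the filename is already taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO data_items VALUES (?, ?, ?)",
                (filename, project, node),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def locate(filename):
    """Return the node that stores the file, or None if it is not registered."""
    row = conn.execute(
        "SELECT node FROM data_items WHERE filename = ?", (filename,)
    ).fetchone()
    return row[0] if row else None


first = register("kids_0001.fits", "KiDS", "groningen")
duplicate = register("kids_0001.fits", "KiDS", "leiden")
print(first, duplicate, locate("kids_0001.fits"))
```

Letting the database enforce uniqueness means that two nodes cannot register the same filename, even when they ingest data independently.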

An Astro-WISE node in full deployment consists of a metadata database, a computing
element, a data storage element (dataservers), web services and an Astro-WISE user
environment. Data processing jobs can be submitted to one of the Astro-WISE
computing elements or to a BiG Grid processing element via the Distributed Processing Unit
(DPU).



3 http://www.astro-wise.org/projects/COMALS/

Each data item in Astro-WISE is an object of a corresponding class. For example,
all reduced images are objects of the ReducedScienceFrame class and keep links to
the scientific images as well as to the calibration images used for the data processing,
the processing parameters and the quality control parameters. This information is stored in the
metadata part of the data item and is available to the user according to the user's access
rights. It is possible to use a number of Astro-WISE web services to process the data
or to verify the quality of the processed data (Buddelmeijer et al. 2012). In addition to
the web services, the user can write a Python script to build their own pipeline from a library
of pre-defined methods and classes for the data processing. The processed data are
stored as private data of the user until the user decides to share them within the project or to
publish them in Astro-WISE or the Virtual Observatory.
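
The step from private data to shared and published data can be sketched with a small access-control model. The three visibility levels, the `Item` class and the `can_view` function are assumptions made for this illustration and do not reproduce the actual Astro-WISE privilege scheme.

```python
from dataclasses import dataclass
from enum import IntEnum


class Visibility(IntEnum):
    """Hypothetical visibility levels, from narrowest to widest."""
    PRIVATE = 1    # only the creator
    PROJECT = 2    # members of the same project
    PUBLISHED = 3  # everyone


@dataclass
class Item:
    name: str
    owner: str
    project: str
    visibility: Visibility = Visibility.PRIVATE  # new data start as private

    def promote(self, user, level):
        """Only the owner decides when data are shared or published."""
        if user != self.owner:
            raise PermissionError(f"{user} does not own {self.name}")
        self.visibility = level


def can_view(item, user, user_projects):
    """Check whether a user with the given project memberships may see the item."""
    if user == item.owner or item.visibility == Visibility.PUBLISHED:
        return True
    return item.visibility == Visibility.PROJECT and item.project in user_projects


frame = Item("reduced_001", owner="alice", project="KiDS")
before = can_view(frame, "bob", {"KiDS"})      # still private
frame.promote("alice", Visibility.PROJECT)     # alice shares within the project
after = can_view(frame, "bob", {"KiDS"})       # bob is a project member
outsider = can_view(frame, "carol", {"WFI"})   # carol is not
print(before, after, outsider)
```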

The Astro-WISE system supports data processing from raw images not only for
OmegaCAM but for a wide range of instruments, including WFC, WFI, Megacam,
SuprimeCAM, Large Binocular Camera and others. Astro-WISE provides users with
a number of catalogs and surveys, including USNO-B1, 2MASS PSC, SDSS DR7,
UKIDSS DR3 and others, which can be combined with the data produced by the user
in a new data product.

As of 2011, Astro-WISE has a storage capacity of 1.6 PB of data and a processing
capacity of 10 Tflops distributed over the Astro-WISE partners. In addition, users of
Astro-WISE can employ processing elements of BiG Grid.

In recent years, in addition to the data processing of the KiDS survey, the
Astro-WISE concept has been used to develop information systems for the LOFAR Long-Term
Archive (Belikov et al. 2012) and the Multi Unit Spectroscopic Explorer data processing
system (Pizagno et al. 2012).